Glitch in the matrix 2.0 - ISITDTU 2022 Quals - Writeup

Introduction

This is a crazy challenge that drained my team's brains šŸ˜¢ and we couldn't solve it in time of the CTF šŸ™ After 2 days of trying to brainstorm, we have finally be able to crack the code, with a little bit of luck šŸ˜—

Since the intended solution will be provided by the author catto after the finals (18/12), which is over ONE MONTH AFTER from when I'm writing these lines (why it gotta take so long catto šŸ˜­ šŸ˜­ šŸ˜­), I decided to create my own writeup to explain our method of solving this problem šŸ˜—

Table of contents

Challenge

Description

The server generates a securely random 64-bit hex string password that is kept secret. We as the clients have to guess the secret password to get the flag.

The server gives us hints about the password through the output of the encrypt(password), which is a function we can call as many times as we can, each time it seemingly produces a different output despite feeding the same input to the function.

This writeup provides a way to obtain the flag with 8 outputs of encrypt(password) in a probability of about 1/40.

Code

Analyzing stuffs...

Let's first dissect everything in the source code to see what it does and how to make sense of the data given to us šŸ˜— If you've already known what to expect, you can hop onto the next section to see the discussion of the solution ā¤ļø

encrypt(message) converts message in hex form to its binary form M as a list of 0 and 1s,

 

For each bit of M, it generates a random 64-bit value C (also a list of 0 and 1s),

 

calls f(C, [:64]) or f(C, BASIS[64:]) depending on the current bit of M, appends the result of the call to the ct variable.

 

BASIS is a global variable, a 2-dimensional list with size 128x128 with unknown content whose rows are shuffled in every call to encrypt().

 

After loop, ct is then converted to its hex form and returned to the user. From this code, it is safe to assume that the value of ct is a list consisting of all 0s and 1s.

 

Also, it can be deduced from ct and the type hints and operations of f(), which is just bunch of repeated XORs of values in v andM[i] (M is BASIS[:64] or BASIS[64:]), that BASIS also consists of 128x128 values of 0s and 1s.

 

All of these XORs and 2-dimensional arrays and lists of 0s and 1s seem to point everything into viewing those things as addition of matrices and vectors in GF(2). And it turns out we can view:

Now we know how to view the encrypted password, let's discuss the solution!

 

How to know the bits of password?

In order to get all the bits of password, we must be able to classify which vector viā†’ in ct belongs to which vector space. Is viā†’ generated from BASIS[:64] (so passwordi is 0), or BASIS[64:] (so passwordi is 1)?

We don't even have BASIS, so if we know a particular vector is formed by a set of 64 rows in BASIS, we can't even tell if that set positions at the top of the matrix or the bottom! The only bet for us is to know which bit is different from which bit by comparing the vector space that spans the corresponding vectors. If we've got the mapping of which bit is different from which bit, we can set one random bit to be 0 and let the mapping does the rest. This gives us the probability of finding the correct answer at 1/2 (given the mapping, of course šŸ˜ƒ)

Notations

Before moving on, I think I should introduce some notations here to help discussing the next part.

 

Matrix1.0

In Glitch in the matrix 1.0, comparing passwordi and passwordj is very much doable. To solve this challenge, we have to provide a way to differentiate between the vectors that are either generated from BASIS[:64] or BASIS[64:].

Because password doesn't change every-time we call encrypt and most importantly, BASIS is not changed in the entire runtime of the program, vi,aā†’ is generated from the same set of 64 vectors as vi,bā†’ for any a,b, or Si,a=Si,b.

This means that we can request 64 encrypts to obtain vi,0ā†’,vi,1ā†’,vi,2ā†’,...,vi,63ā†’ that is in the same vector space as BASIS[:64] or BASIS[64:]. In order to verify if passwordi=passwordj, or Si,a=Sj,a, I created a 128x128 matrix that looks like this:

[vi,0ā†’vi,1ā†’...vi,62ā†’vi,63ā†’vj,0ā†’vj,1ā†’...vj,62ā†’vj,63ā†’]

If the rank of this matrix is 64, vi,0ā†’,vi,1ā†’,vi,2ā†’,...,vi,63ā†’,vj,0ā†’,vj,1ā†’,vj,2ā†’,...,vj,63ā†’ comes from the same span of 64 vectors, so Si,a=Sj,a, and we can conclude that passwordi=passwordj. If the rank of this matrix is 128, the opposite is true and we get passwordiā‰ passwordj.

In script, we fix i=0 and let j runs from 1āˆ’63 to generate the mapping and recover the password with probability of 1/2.

 

Matrix2.0

Glitch in the matrix 2.0 is the updated version where BASIS is shuffled every-time we request a new encrypt(password). This means that the property that:

vi,aā†’ is generated from the same set of 64 vectors as vi,bā†’ where aā‰ b, or Si,a=Si,b,

is not correct this time around, which is very sad šŸ˜­. We also cannot refer to BASIS[:64] and BASIS[64:] as constants anymore.

Probability hint

After a while, a hint came out from the author:

And we still had no idea what the hell she is talking about until the end of the CTF ā˜¹ļø

...

And then, a thought comes to my mind:

While it's true that vi,aā†’ is not gonna be generated from the same set of 64 vectors as vi,bā†’ if aā‰ b, some of the vectors in those sets might overlap. Let P(Si,aāˆ©Si,b={}) be the probability of Si,b purposefully choosing a set of different 64 vectors to Si,a in a list of 128 vectors, then:

P(Si,aāˆ©Si,b={})=1/C12864=4.17e-38

proving that this is a very, very, very, very, very unlikely event.

This gives me an insight that the sets Si,a and Si,b almost always overlap, and from above we've got:

P(#(Si,aāˆŖSi,b)ā‰¤127)=1āˆ’1/C12864

This might be the "probability" thing catto was talking about! Also from this another idea is formed!

Detecting āŒŠ128nāŒ‹+1 similar bits in password

Let's say we generate encrypt(password) n times for nā‰„3.

And somehow, for a certain value of passwordi, we obtain the vectors vi,0ā†’,vi,1ā†’,...,vi,nāˆ’1ā†’ such that:

#(Si,0āˆŖSi,1āˆŖSi,2 āˆŖ ... āˆŖSi,nāˆ’1)ā‰¤127

This means that Si,0āˆŖSi,1āˆŖSi,2 āˆŖ ... āˆŖSi,nāˆ’1 contains about 127 rows in BASIS except one/some row(s). The remaining row(s) of BASIS would be present in the set of Sj,0āˆŖSj,1āˆŖSj,2 āˆŖ ... āˆŖSj,nāˆ’1 where passwordjā‰ passwordi as Sj,a always contains the vectors that Si,a doesn't have.

 

And somehow, we can find a set of āŒŠ128nāŒ‹+1 elements {passwordi0,passwordi1,...,passwordiāŒŠ128nāŒ‹}, where passwordi0=passwordi1= ...=passwordiāŒŠ128nāŒ‹.

 

So, this matrix M, which has n(āŒŠ128nāŒ‹+1)>128 rows,

M=[vi0,0ā†’vi0,1ā†’...vi0,nāˆ’1ā†’......viāŒŠ128nāŒ‹,0ā†’viāŒŠ128nāŒ‹,1ā†’...viāŒŠ128nāŒ‹,nāˆ’1ā†’]

must have rank ā‰¤127 as all of these vectors are generated from the set of Si,0āˆŖSi,1āˆŖSi,2 āˆŖ ... āˆŖSi,nāˆ’1, which contains only ā‰¤127 rows of BASIS.

 

If we change just one of passwordik to a different bit from the rest, the above matrix M will most likely have higher rank (often 128) than before, since there will be a row in M that is a linear combination of the vectors including the one(s) that is/are not in Si,0āˆŖSi,1āˆŖSi,2 āˆŖ ... āˆŖSi,nāˆ’1.

 

Here comes our mechanism to detect āŒŠ128nāŒ‹+1 similar bits in password. In particular, if we randomize {i0,i1,...,iāŒŠ128nāŒ‹} enough times and construct M, we might stumble across a situation where M has rank ā‰¤127 and we can use this to say that passwordi0=passwordi1= ...=passwordiāŒŠ128nāŒ‹ with high probability.

 

However, in order for this to work, we need to make sure to choose n so that both of these conditions hold:

  1. #(Si,0āˆŖSi,1āˆŖSi,2 āˆŖ ... āˆŖSi,nāˆ’1)ā‰¤127 occurs in high probability.
  2. Choose {i0,i1,...,iāŒŠ128nāŒ‹} that passwordi0=passwordi1= ...=passwordiāŒŠ128nāŒ‹=0 in high probability.

 

It turns out, if scenario 1. occurs more likely, scenario 2. will occur less likely and vice versa, so we need to balance n. Through trials and errors, I've decided to choose n=8 and the correct set {i0,i1,...,iāŒŠ128nāŒ‹} is found in a probability of about 1/20, after brute forcing at most 400000 sets.

Recover the remaining bits in password

Remember the matrix M from the last section? We can use it to detect the rest of the bits in password. For every j-th bit in password that jāˆ‰{i0,i1,...,iāŒŠ128nāŒ‹}, I decided to construct a new matrix N from M that is:

N=[Mvj,0ā†’vj,1ā†’...vj,nāˆ’1ā†’]

Then compares rank(N):

With this information, we can recover password with a probability of 1/2.

Solve code

This piece of code implements what I've just described above, with some small changes: